An algebraic approach to Integer Portfolio problems 

F. Castro J. Gago I. Hartillo J. Puerto J.M. Ucha 



Abstract 

Integer variables allow the treatment of some portfolio optimization problems in a more 
realistic way and introduce the possibility of adding some natural features to the model. 

We propose an algebraic approach to maximize the expected return under a given 
admissible level of risk measured by the covariance matrix. To reach an optimal portfolio 
it is an essential ingredient the computation of different test sets (via Grobner basis) of 
linear subproblems that are used in a dual search strategy. 

1 Introduction 

Mean-variance portfolio construction lies at the heart of modern asset management and has 
been among the most investigated fields in the economic and financial literature. The classical 
Markowitz's approach, cf. |Mar52] . or |MarOO] for a recent reissue of his work, rests on the 
assumption that investors choose among n risky assets to look for their corresponding weights 
Wi in their portfolios, on the basis of 

1. previously estimated expected returns /ij, and 

2. the corresponding risk of the portfolio measured by the covariance matrix VL. 
Portfolios are considered mean-variance efficient in two senses: 

• If they minimize the variance for a given admissible return R: 

(MVPl) min ( u;i ... Wn)^ 



subject to ^iWi + ■ ■ ■ + ^nWn > -R, 
Wl H h = 1, 



Wj e 



If they maximize the expected return for a given admissible risk (variance) r^: 

(MVP2) max fiiWi + ■ ■ ■ + /U„w„, 

subject to [ Wi ... Wn ) ^ 



Wl H \-Wn = I, 

Wi e M. 



\Wn J 
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In problems (MVPl) and (MVP2) the weights Wi stand for the percentage of a given asset 
in the portfoho. It is well-known that both problems give dual views of a common analysis 
since they correspond to two scalarizations of the more general bicriteria approach: obtaining 
the entire set W of mean- variance efficient portfolios. In this sense, (MVPl) and (MVP2) are 
equivalent in that W can be obtained either solving (MVPl) or (MVP2) parametrically on R 
(admissible returns) or r (admissible risks), respectively; see |SNT85] for further details. 

Although the standard statement of the mean-variance portfolio problem uses continuous 
variables, there are different reasons to consider integer variables, as it is pointed out in different 
authors [SMOSl ILTOI IBLCITI IYou98l[CFn7j . Our goal is to treat problem (MVP2) in its integer 
version based on the following considerations: 

• In real markets, we can buy or sell only an integer amount of assets, so considering real 
weights in the portfolio is actually an approximation. As it is well known already for linear 
problems, the rounding of the real values obtained for one optimal portfolio with continuous 
weights may produce, in principle, an infeasible solution or a very bad approximation 
to the optimal integer solution. We think that this is indeed the case for a portfolio 
that potentially considers future contracts, as in |GK08j for commodity futures, to obtain 
lower correlations concluding on the benefits of diversification. We show this behaviour in 
Example 15.21 

• The need to diversify the investments in a number of industrial sectors |BL07j . 

• The constraint of buying stocks by lots |BL07l ICF071 IJHLMOlj . 

• Another reason for the use of integer variables, usually binary, appears in practice because 
portfolio managers and their clients often hate small active positions and very large number 
of assets, the reason being that they produce big transaction and monitoring costs. Hence 
it is rather usual to add the constraints associated to the following conditions: 

1. there should be at least some previously decided minimum percentage (lower bounds), 
and 

2. there should be a maximum number of assets (upper bounds). 

Furthermore, transaction costs are included as linear constraints in the return equation, 
using decision variables, as in |LT08] . 

The above mentioned requirements enrich any portfolio model for real applications and re- 
quire 0-1 or, more generally, integer variables. Nevertheless, as far as we know there are no 
specific algorithms to solve these problems. Note that methods in semi-definite programming 
deal with this problem, but they are oriented to the continuous case. See also |SM05j for a set 
of routines that handles these problems in the continuous case. In our approach, it is natural to 
consider non negative integer variables Xi, . . . ,a;„ for the quantities of each product. Then it is 
necessary to take into account 

• the unit prices ai, . . . , a„ of the products, 

• the expected returns of each stock /ii, . . . , 

• and the total available budget B. 
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Then Problem (MVP2), in its integer version, can be restated as 



subject to 



"^aX Ya=1 l^i^i = 

Q{x) < B^r'^, 

X e (Z+)-, 



where 




Q{x) = ( aixi 



n 



) n 




1.1 Previous works and our approach 

There have been several works with different techniques to treat Problem (MVPl) in an integer 
framework. 

• For (MVPl): piecewise linear approximation in |Sha71] and |Sto73] . absolute deviation 
selection in |YK91j . branch- and- cut techniques in |Bie96j and minimax selection approach 



• For (MVPl) with additional transaction cost: dual Lagrangian relaxation in |MM91l 
IGK87j . linear terms transformation in |LC98j . separable terms transformation in [Sha71l 
ISto73j . and parallel distributed computation reformulating the objective function into an 
ellipse and using piecewise linearization in |LT08] . 

For Problem (MVP2) the literature is not so wide, although this approach is very usual in 
practice for the so called benchmark portfolios, as used in |MM08] . The results of |BPTOO] have 
been in some sense a milestone in applying tools of Algebraic Geometry (namely Grobner bases) 
to optimization, although this method is not effective for Problem (MVP2). See [AL94[ Ch.1,2], 
|GLO05l Ch. 2], or |BW05j for introductions to this subject. 

Our goal is to present a new algorithm to deal with portfolio problems with integer variables 
and non-linear constraints. The method is based on the computation of some test sets using 
Grobner bases. These bases are computed from a linear integer subproblem that contains the 
original linear constraints together with some new cuts induced by the non-linear constraints. 
The use of the reversal test-set allows us to design a dual search algorithm that moves from the 
optimal solution of the linear subproblems towards the optimal solution of the entire portfolio 
problem. Our approach is new with respect to the cited references, and we will see in the last 
section that it is effective to deal with portfolios with number of stocks comparable to those in 
the literature (see [UF07llLT08] l 

The paper is organized as follows. In Section |5] we fix the formulation of the problem to be 
treated with the method explained in |TTN95] . In Section [3] the successive additions of linear 
constraints are explained, and the dual search algorithm based on a test-set computed from a 
Grobner basis is applied to find an optimal point of Problem ([1]). It is also included an illustrative 
example. 

Section H] explains the existence of a lower bound to the risk value below that it is 
not necessary to invest the whole budget to get an optimal portfolio. Section [5] contains some 
computational experiments and Section [6] draws some conclusions on the paper. 



in |You98j . 
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2 Preliminaries 



If one accepts the integer version of model (MVP2) to obtain efficient portfolios, the objective 
function and all constraints but one — that is quadratic — are linear. This is related to the 
form of the problem treated in |TTN95] . To solve an integer programming problem (P) with 
linear objective function under different linear and nonlinear constraints, the following general 
approach can be applied (see [TTN95] for a complete example): 

1. First obtain a test-set for a linear part of (P), let us call it (P'). A test-set T is a set 
of vectors in Z*^ such that, given a feasible point F of (P'), if none of the feasible points 
obtained adding the elements of T to F improves the value of the objective function, then 
F is an optimum of (P')- A test-set of a linear integer problem can be obtained via Grobner 
bases ( |CT91] . see |Stu96l ch. 5] for a modern introduction). 

The best known way of obtaining these bases is using programs as 4ti2 ( |ttU8j ). that 
takes advantage of the special structure of the toric ideals corresponding to linear integer 
problems. Programs for general Grobner bases are not so good for big examples. 

2. Starting at the optimum of the linear problem (P'), which is possibly an infeasible point 
for problem (P), use the reversa/ test-set — so decreasing the objective function each time 
a vector of T is applied — to move throughout the set (tree) of feasible points of (P') until 
you obtain feasible points for the whole problem (P). In our case, it means, portfolios with 
admissible risk. If this happens, one can prune the remaining feasible solutions. 

Our approach consists of applying this general idea to Problem ([1]) mixing it with some 
bounds obtained from the continuous relaxation of the problem, to reduce the feasible region 
described by the linear constraints, as in |LT08j . 

The initial problem is 

(P) max{/i*a; | a^x < B,Q{x) < B^r^,x G (Z+)"}, 
and its linear relaxation is 

(P') max{/i*a; \ a^x < B,x e (Z+)''}. 

The purpose of the following section is to explain how to obtain additional linear constraints 
to improve the accuracy of the linear description of problem (P), taking advantage of geometrical 
properties of the definition of risk. We look for some linear constraints based on the convexity 
of the hyperquadric given by the covariance matrix Q, which is symmetric positive definite. 
However, too many constraints means too many elements in the Grobner basis, so the point 
will be to find the precise trade-off between constraints to eliminate unnecessary points in our 
searching, and at the same time not to increase the basis unnecessarily. 

3 A dual search algorithm based on a test-set 

A direct approach to problem (P) following [TTN95] may lead to a non practical procedure to 
find the optimum. If we compute a test-set related to problem (P'), and move along the set 
of solutions of the linear relaxation of problem (P), the number of points to be processed is 
huge, even for a small number of variables. The main drawback of the procedure is the great 
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number of integer solutions to be visited between the starting point and the feasible region of 
the problem (P). In order to avoid this enormous enumeration, some cuts can be added, using 
the convexity of the hyperquadric defined by the symmetric positive definite matrix Q. 

We assume that a black box is available providing solutions to linear continuous optimization 
problems with quadratic convex constraints, as the function fmincon in Matlab, the different 
implementations of semi-definite programming compared in |Mit03] . or even the linear time in 
fixed dimension algorithm by |Dye92] . 

3.1 Starting tasks 

In order to improve our representation of problem (P), we proceed as follows: 

• The first step is the computation of the continuous solution Uc of the problem 

max{/i*a; | a*£c < B,Q{x) < B^r^,x G 

which gives us a return /x^itc = Rc- Clearly the discrete return is less than or equal to 
[i?cj (function ComputeContinuousOptimum in Algorithm [1]) . 

• Secondly, we need a good discrete feasible point, which will give us a lower bound for the 
return. The problem (P) is always feasible, because the origin belongs to the region, but 
this point it is not very useful. The rounded point = [UcJ is not always feasible, as it 
is well-known. 

• In order to get a feasible starting point, it is possible to decrease each coordinate until we 
get a feasible point. After that, the point can be improved so that the return cannot be 
increased in any direction inside the feasible region (function ComputeDiscreteApprox in 
Algorithm [1]) . 

From this point Pe we will reach the discrete optimum. Let Re = fJ'^Pe be the return associated 
with the discrete feasible point Pe- A new valid formulation of the problem is 

max{/x*£c I a^x <B,Re< M*£c < [i?J,Q(x) < fiV^ a; G (Z+)''}. 

3.2 Addition of new linear constraints 

From the above formulation, we improve our description of problem (P) in two ways. 

1. Adjusting hyperplanes to the hyperquadric given by upper and lower bounds on the vari- 
ables. 

To this end, for j = 1, . . . , n, we solve the continuous problems (function ComputeLowerBounds 
in Algorithm [1]) 

mm{xj I a*£c <B,Re< < [Rc\,Q{x) < B'^r'^,x G (M+)"}. 

The above minimum values give us an integer lower bound bj for each variable xj , applying 
the ceiling function. The constraints bj < Xj are not going to be involved in the computa- 
tion of the test-set through the Grobner basis. This is because we can write bj + Uj = Xj, 
where i/j > 0, and this change of variables do not alter the coefficient matrix of the linear 
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cuts, nor the linear cost function. Since the computation of the Grobner basis does depend 
only of this matrix, there is no extra computation time. 

In a similar way, the maximization problems 

max{xj I a^x < B,Re< n^x < \_Rc\,Q{x) < B'^r^^x e (M+)''} 

for j = l,...,n, provides us upper bounds. However, these linear constraints highly 
increase the size of the test-set. We will only use them to fix variables, because the upper 
and lower bounds of some variables are equal in many examples. This fact allows us 
reducing the dimensionality of the problem. 

We consider the polytope 

P = {x \ a^x < B, fi^x < [R^\ , fi^x > Re,b < x} 
where b < x stands for the conditions bi < Xi,i = 1, . . . ,n. 

2. Adding nearly tangent hyperplanes to shrink the polytope. 

The main idea is the addition of cuts so that the farthest regions of the polytope P could 
be cut off. To do that, we use a point of P where the function Q takes its greatest value. 
This is equivalent to solve the continuous problem 

max{Q{x) \ x E P}. 



It is coded as function ComputeMaxRisk in Procedure NewPolytope Note that this problem 
can be efficiently solved since it is of polynomial complexity |KTH79] . 

Let Pmax be a solution of Problem ([2]), s be the half-line from the feasible point pe to Pmax, 
and p' = QOs the intersection point of s with the hyperquadric Q defined by the function 
Q{x) = B^r\ 

Let H be the supporting hyperplane to Q at the point p'. By the convexity of Q, the 
hyperplane H defines a linear half-space that contains the interior of Q. 

The coefficients of the hyperplane H are real numbers, so its normal vector n may have 
non integer components. However, we are looking for linear constraints with integer coef- 
ficients, so we round the vector n to an integer vector n G (variable Prec in Procedure 



NewPolytope] ). Then we proceed to compute the independent term c of the tangent hyper- 
plane to Q whose normal vector is equal to n, and such that the half-space n*cc < c = \c\ 
defines a linear half-space which contains the interior of Q. 



This process can be iterated as many times as we wish (Algorithm NewPolytope), shrink- 
ing the polytope P. Nevertheless, there should be a trade-off between the number of new 
hyperplanes and the size of the Grobner basis associated with the system, so the maxi- 
mum number of cuts allowed is a parameter of the algorithm. Additionally, the difference 
between = msix{Q{x) \ x G P} and the initial risk Tq is another stopping criterion, 
passed as the parameter Tol to the algorithm. 
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3.3 Dual iterations with the test-set 

Now after the above two phases Problem [1] is transformed to 

max fi^x 
s. t. a^x < B, 

Re < t^'x < [R,\, 

TXj^'X ^ C^, k — 1, . . . , 5, 

x>b,xe (Z+)'^, 
Q{x) < B^r^, 

where hl.x < Ck,k = 1, . . . , s are the new cuts. The test-set G is associated with the hnear 
problem 

max fi^x 
s. t. a^x < B, 

x>b,x e 

The condition Re < ^i^x is tested inside the tree-search, and used to prune leaves. The value of 
Re is updated as soon as a new feasible point with a better return is found. 

Once we have the test-set of the above polytope we proceed with the resolution method. The 
main bottleneck of our approach is the search over the tree defined by the test-set. The number 
of points to be processed is strongly related to the initial feasible point Pe found because: 

1. The estimated return Re defines the lowest facet of the polytope P in terms of the objective 
value. 

2. The upper and lower bounds for the variables Xi are strongly determined by Re- The closer 
is Re to the optimal value, the narrower is the interval for each variable Xi. 

On the other hand, to apply the reversal test-set search we need an initial (and usually 
non feasible) point, but not far from feasibility. We take the point Pbounds built by considering 
the independent terms of the linear constraints of Problem ( 13. 3p . after applying the translation 
bi + yi = Xi, i.e., 

Pbounds = {b,B - a%, [Re\ - 4 - n*6)*. 

The starting point for the reversal test-set tree search is the point pini, the reduced of Pbounds by 
the test-set G. This is the solution of the linear problem (13. Sp . as shown in |CT91j . With the 
reversal test-set, we search over the tree of nodes (feasible points for the linear problem) until 
we obtain a feasible point for the entire problem, including the quadratic constraint (function 
TreeSearch in Algorithm [1]). If the switch SwFictBounds is set to true, the search is stopped 
in the first point that improves the estimated return given by the incumbent point Pe- 

Although the tree search has to end, a maximum number of processed records is passed to 
the procedure TreeSearch as a parameter. It could happen that the number of points processed 
exceeds the maximum allowed, and the optimum had not been found. If a new feasible point p'^ 
is found {Swimprove = true), the bounds can be recomputed. The test-set is still valid for the 
new search. The only new computations are the independent terms of the hyperplanes and the 
reduction to find the starting point. 

However, if the test-set were huge, it would be better to compute the new linear cuts given 
by the new estimated point p'^ and its associated Grobner basis. In general, the Grobner basis 
will be shorter, and the elapsed time spent in the tree search will be shortened. 
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3.4 Restricted search in a region 



In the case that a new feasible point is not found after a given number of processed points 
[Swimprove = false), it is then possible to apply a branch- and- cut technique with the bounds. 
Indeed, let Pf, be the feasible point that gives the value R^, and b the vector of lower bounds for 
the variables Xi,i = 1, . . . ,n. Compute a point b' = b + a{pe — 6), < a < 1 (usually a = 1/2), 
which we call fictitious bound, and consider the following problem: 

max ^^x 
s. t. a^x < B, 

Re < < 
"^k*^ — ^'t' k — 1, . . . , 5, 
x>b',xe {'L+Y, 
Q{x) < B^r\ 

Solving the above problem, we expect to find a new feasible point to relaunch the search process. 
The idea is simple: the search is restricted to a smaller region, but the solution of the original 
problem is not guaranteed to be in that region. It is a heuristic technique to take advantage 
that this new problem does not need a new test-set. In our implementation, this search can be 
launched by setting the switch SwFictBounds equal to true. The process is stopped as soon as 
a new point is found, and then we start again. If no point is found after the maximum number 
of allowed nodes (variable MaxNumNodes in Algorithm [1]) , then we stop, and the point Pe is 
our best value. 

The pseudocode of the main algorithm is described in DiscreteOptimum. The procedure 
NewPolytope presents the pseudocode of the strengthening of the polytope P by adding valid 
cuts. The switch SwEOP is used to mark the end of the process. 



3.5 An illustrative example 

Let a = (6075,3105)* be the vector of prices, and n = (12500, 10000)* the vector of returns, 
with the covariance matrix equals to 

/ .8328436 - 4 .485325e - 4 \ 
^.4853256- 4 .651298e - 3 )' 

Let B = 9 X 10^ be the budget, and = 3 x 10^^ the fixed risk. The continuous optimum is 
= (772.754778,215.028056)*, with a total return R^ = 11809715.29. Then [R^\ = 11809715, 
and rounding Uc we get the point Ud = (773, 215), which is not a feasible point. Subtracting from 
the components, we eventually reach a feasible point Pe = (773,214), whose associated return 
is Re = 11802500. The lower bounds bi and 62 are now computed, solving first the continuous 
problems 

min{xi I a*a; <B,Re< n^x < R^Qix) < r'^B'^,x G (M+)^}, 

where 

Qix) = ( aixi a,X2 ) ( \ . (2) 

y 02X2 J 

The respective continuous values are 61 = 752.69, rounded to h\ = 753, and 62 = 190.58, rounded 
to 62 = 191. We want to solve the problem 

max{fi^x I a*a; <B,Re< fi^x < [Re\,Q{x) < r^B^,x >b,xe (Z+)^}. 
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In the associated linear problem 



max {n^x I fi^x + Zi = [Rc\ , a^x + Z2 = B,x>b,xE (Z+)^} 



we change the variables Xi = yi + bi, and the resulting linear problem has the same coefficient 
matrix. The computation of a Grobner basis leads to the test-set formed by vectors 



i;i = (-4,5,8775,0)*, V2 



= (1, -2, 135, 7500)*, V5 = (2, -3, -2835, 5000)* 
Now, we reduce the point 

Pbounds = (^1, B - aibi - 02^2, [Rc\ " AH&i 



1, 1, 2970, 2500)*, ^3 = (0, -1, 3105, 10000)*, 
V6 = (3,-4,-5805,2500)*. 



/i2&2)* 



(753,191,3832470,487215)* 



with the test-set to get the linear problem optimum, which is the starting point p-mi of the tree 
search. In this case, Pmi = (791,192,3598515,2215). We now show the path followed by the 
search procedure in the tree of solutions of the linear part of the problem. In each node, we write 
the distance Ai to the continuous return associated with it, that is, Ai = [Rc\ — i-i\p + Vi). 
The initial distance is Ag = 7215, the difference between \_Rc\ and Re- The larger the value, the 
smaller the return. Therefore, values larger than Ag means that the corresponding branch can 
be pruned. The list of nodes to be processed are then ordered by Ai. Note that black dots '•' 
mean pruned nodes, and white dots 'o' mean new nodes. The points are shortened to the two 
first components to save space. 



Node p 


= (791, 192)*, Ae 


= 7215. Leaves p + t^i, z = 1, . . . , 6: 




- P4 


- Vi = 


(787, 197)* 


Ai 


= 2215, > Tq. New node o. 




- P4 


- V2 = 


(790, 193)* 


Ai 


= 4715, > Tq. New node o. 




- P4 


- V3 = 


(791,191)* 


Ai 


= 12215 > Ae. Pruned by A^. 




- P4 


- ■U4 = 


(792, 190)* 


Ai 


= 9715. Pruned by bound 62 = 


191 


- P4 


- V5 = 


(793, 189)* 


Ai 


= 7215. Pruned by bound 62 = 


191 


- P4 


- Vq = 


(794, 188)* 


Ai 


= 4715. Pruned by bound 62 = 


191 



The above information gives rise to the following descendants. 
0(791, 192) ^•(794,188) 




o(787,197) o(790,193) .(791,191) 

List of ordered nodes: {(787, 197)*, (790, 193)*}. 
^ Node p = (787, 197)*, Ag = 7215. Leaves p + Vi,i = 1, . . . ,6 

- p + vi = (783, 202)*, Ai = 2215, > rg. New node o. 

- p + V2 = (786, 198)*, Ai = 4715, > rg. New node o. 

- p + V3 = (787, 196)*, Ai = 12215 > A^. Pruned by A^. 

- p + Vi = (788, 195)*, Ai = 9715 > A^. Pruned by A^. 



(792, 190) 



► (793,189) 
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- p + V5 = (789, 194)*, Ai = 7215 = A^. Pruned by Ag. 

— p + vq = (790, 193)*. Already in the list of nodes to be processed. 

The corresponding diagram is displayed as 
o(787,197) ^.(790,193) 




o(783,202) o(786, 198) .(787,196) .(788,195) 

List of ordered nodes {(783, 202)*, (786, 198)*, (790, 193)*}. 
^ Node p = (783, 202)*, A^ = 7215. Leaves p + Vi,i = 1, . . . ,6: 



► (789, 194) 



(779,207)* , Ai = 2215 < Ae,r^ < Tq. Feasible point, and improvement. 



P+Vi 

Update Ae = 2215. 
p + V2 = (782,203)*, Ai 
p + vs = (783, 201)*, Ai 
p + Vi = (784,200)*, Ai 
p + V5 = (785, 199)*, Ai 
P + vg = (786, 198)*, Ai 



4715 > Ag. Pruned by Ae. 
12215 > Ae. Pruned by A^. 
9715 > Ae. Pruned by Ae. 
7215 > Ae. Pruned by Ae. 

4715 > Ae. Deleted of the list of nodes to be processed. 



The tree representation is 

0(783,202)^:^ .(786, 198 




(783,201) 



.(784,200) 



►(785, 199) 



List of ordered nodes {(790, 193)*}. 



-k Node p = (790, 193)*. This node is pruned because Ai > Ae, so the process is finished. 
The total number of processed nodes is 4. 

The optimum is (779, 207)*, with a difference return of 2215 units from the continuous solution. 
The initial estimated point was at 7215 units. This example shows a large difference between 
the rounded solution and the discrete optimum. 

Although this example is very simple, we will now show the improvement that we get using 
the cuts. The initial polytope P is defined by 

P = {xeW \ a^x < B,Re< At*a; < [Rc\ ,x>b}. 

The first maximum of the quadric Q{x) described in Equation ([2]) on the polytope is Pmax = 
(753, ^2000^ )* tangent line at the intersection point has as normal vector (0.54452, 0.45547)*. 

Rounding to three digits, the new normal vector of the hyperplane is (545,455)*, and the in- 
dependent term is 519113.7265. Then the first cut is 545xi + 455x2 < 519114. Adding this 
inequality to the polytope P, and repeating the process, we have the new maximum at point 



10 



Pmax = (^^^; 191)*, and the normal vector of the tangent cut is equal to (0.56698,0.43301)*. 
Then the new cut is 567xi + 433x2 < 531402. 

The linear problem, with the slack variables considered, is 



max X 
s. t. + zi = [Rc\ , 



a*a; + Z2 = B, 
X > 6, 

545x1 + 455x2 + ^3 = 519114, 
567x1 + 433x2 + = 531402, 



and the test-set associated with it has 7 elements: 



(-4, 5, 8775, 0, -95, 103)*, W2 = (-1, 1, 2970, 2500, 90, 134)*, 

(0, -1, 3105, 10000, 455, 433)*, 104 = (1, -2, 135, 7500, 365, 299)*, 

(2, -3, -2835, 5000, 275, 165)*, Wq = (3, -4, -5805, 2500, 185, 31)*, 
(7, -9, -14580, 2500, 280, -72)*. 



The starting point Pini is the reduction of the point 



Pbounds 



(753, 191, 3832470, 487215, 21824, 21748)* 



with the test-set. In this case, pi^i = (779, 207, 3624840, 2215, 374, 78)*, and it is already the 
optimum point. 

In this example, there is a large amount of money (value 3624840) that is not invested. The 
explanation of this counterintuitive behaviour is because the risk is below a critical threshold 
rf. The way to compute this threshold rf is the goal of the next section. 

4 A remark on admisible risks 

In Problem (MVP2), there exist values of the risk where the optimal investment does not 
exhaust the available budget, i.e., the linear constraint a^x = B is not active in the optimal 
solution. Furthermore, there exists a value [border risk) below that it is not necessary to 
invest the whole budget to get the optimum. 

The main idea is that the optimum of a linear function with a quadratic constraint Q{x) < 
B'^r^, Q symmetric positive definite matrix, is found at the point on the quadric whose tangent 
hyperplane is parallel to the vector given by the objective function. The only problem is dealing 
with the negative components that the point could have. 

Proposition 4.1. In Problem (MVP2), there exists a risk such that if < r^, then the 
optimal investment does not need to invest the overall budget. 

Proof. Given a quadric C with matrix 



and V the normal vector of the hyperplane v*x = 0, we are looking for a point p & C such that 
the hyperplane 



Q 



o^o Qoo 



) 



Qoo symmetric positive definite matrix. 



(1 
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is parallel to v^x = 0. Then ao + QooP = At; for certain A G M, and applying that p belongs to 
C we get 

There are two solutions in A, and we hold the positive one. In the case of Problem (MVP2), 
we have ao = 0, apo = —r^B^, and Qoo = C = DQD, where D is the diagonal matrix 
diag(ai, . . . , a„). The quadric C is centered at the origin, the vector v is now /x, and 

T B T B 

A = — , and the tangent point is pt = ,„ , — C^^ii. 

This point could have negative components, which means that it is outside of the feasible region 
of the problem. Let J be the set of indexes j such that the j-th component of vector C~^ij, is 
positive. Let Cj be the hyperquadric restricted to the intersection of hyperplanes Xj = 0, j G J, 
and fij the vector of components f-ij with j G J. The restricted hyperquadric is centered at the 
origin, an the optimum of the restricted problem is reached at 

rB 

qt = aCj^fXj, where a = 



The total amount of invested money is the dot product qt* aj, and it must be less than B: 



qt»aj = =ajCj fij < B. 



Then 



2 MjCj Vj ^ 2 



□ 

It is worth noting that does not depend on the initial budget B. 



5 Computational results 

This section illustrates the use of our approach in solving some integer portfolio problems with 
data taken from the literature. In doing that, we have implemented Algorithm [1] in SciLAB, to 
get a portable code, in an Intel Core2 Duo CPU, 2.53 GHz, and 3 GB of RAM (code is available 
upon request for comparison purposes). The first example solves an actual problem with 44 
stocks, whereas the second one shows the big sensitivity of these models with regard to the use 
of rounded solutions from the optimal solutions of the relaxed (continuous) formulation. 

Example 5.1. This example illustrates the use of our methodology with actual data taken from 
44 stocks indexed in Eurostoxx, from January 2003 to December 2007. The vector of initial 
prices is given by the prices of the stocks on January 3rd 2008 (see Tabled]), and the returns are 
estimated from the monthly historical data. 
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Ticker 


Price 


Return 


Ticker 


Price 


Return 


Ticker 


Price 


Return 


cLCcl.pcL 


22.7 


2.8 


agn.as 


12.0 


0.8 


ai.pa 


101.6 


12.4 


alv.de 


144.9 


32.6 


bas.de 


100.9 


24.9 


bay.de 


61.6 


23.0 


bbva.mc 


16.6 


2.9 


bibe.mc 


10.1 


2.4 


bn.pa 


61.2 


10.9 


bnp.pa 


73.1 


11.0 


ca.pa 


52.2 


5.4 


cs.pa 


27.0 


5.6 


dai.de 


64.7 


13.5 


dbl.de 


128.6 


45.8 


dbk.de 


87.8 


17.8 


dg.pa 


48.7 


14.1 


dte.de 


14.9 


1.3 


enel.mi 


8.1 


0.7 


eni.mi 


25.1 


3.4 


eoan.de 


143.8 


37.8 


for a. as 


18.2 


1.9 


fp.pa 


56.6 


7.8 


fte.pa 


24.5 


1.5 


g.mi 


30.6 


2.2 


gle.pa 


97.6 


15.8 


gsz.pa 


39.5 


7.9 


ing.as 


26.1 


5.1 


isp.mi 


5.3 


1.3 


Ivmh.pa 


82.0 


13.9 


muv2.de 


132.0 


19.0 


noa3.de 


25.4 


4.7 


or. pa 


96.2 


9.4 


phia.as 


28.6 


4.3 


rep.mc 


24.9 


3.7 


rno.pa 


95.2 


20.0 


rwe.de 


95.0 


28.0 


san.mc 


14.6 


3.1 


san.pa 


62.0 


4.7 


sap.de 


34.5 


4.5 


sgo.pa 


62.4 


13.0 


sie.de 


107.1 


24.8 


su.pa 


91.1 


17.9 


tef.mc 


21.9 


4.5 


ucg.mi 


5.6 


0.7 









Table 1: Data from EuroStoxx 



rl 


max 


cuts 


basis 


reduction 


nodes 


optimum 






0.0015 


0.00154 


1 


6657 




165 


xg = 42, xi4 


= 4,Xi6 = 


29, 








(9 s.) 


(22 s.) 


(99 s.) 


X20 = 5, ^28 


= l,a;36 = 


8. 


0.0020 


0.00205 


1 


40256 




137 


Xg = 51 , 


= 5, Xig = 


19, 








(446 s.) 


(240 s.) 


(497 s.) 


3^20 = 1) ^2% 


= 1,3^36 = 


12. 


0.0025 


0.00256 


2 


27082 




32 


Xg = 59, Xi4 


= 6,Xi6 = 


13, 








(194 s.) 


(421 s.) 


(83 s.) 


2^28 = 2, X36 


= 10. 




0.0030 


0.00301 


1 


12504 




62 


Xg = 64, X8 = 


= 1, Xi4 = 1 


8, 








(26 s.) 


(142 s.) 


(81 s.) 


3^16 = l,a^28 


= l,a;36 = 


10,X37 = 1 


0.0035 


0.00351 


1 


2357 







xg = 68, Xi4 


= 10,Xi6 = 


= 1,X36 = 5. 








(1 s.) 


(4 s.) 


(0 s.) 








0.0040 


0.00430 





569 




9844 


xg = 74, xi4 


= 10,Xi6 = 


= 1, 








(0 s.) 


(6 s.) 


(821 s.) 


3^28 = 2, X36 


= 1. 






0.00404 


1 


11924 
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Xg = 74, xi4 


= 10,Xi6 = 


= 1, 








(32 s.) 


(121 s.) 


(10 s.) 


3^28 = 2, X36 


= 1. 




0.0045 


0.00451 


1 


7087 







Xg = 84, Xi4 


= 6,xig = 


1, 








(14 s.) 


(33 s.) 


(0 s.) 


X2% = 1- 






0.0050 


0.00508 





357 




6 


Xg = 91, Xi4 


= 3, X28 = 


1. 








(1 s.) 


(0 s.) 


(0 s.) 









Table 2: Discrete optimums for different values of risk Tq 
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The border risk is = 0.00035, and B = 6000. For the computations, all the prices and 
returns have been multiplied by 10 in order to work with integer variables. In this example we 
set the parameters to 

'^max - '"o < 10""^) MaxNumCuts = 4, MaxNumNodes = 10000 

In Table [2] we show the results of the application of our algorithm assuming different risk 
levels. Each risk level is organized in a block of two rows. The first one gives the corresponding 
information, and below we report on the elapsed time to obtain these elements. Column r^^^,^ 
contains the greatest risk reached in the polytope with the number of added tangents as shown 
by the following column. Column 'basis' denotes the number of elements of the computed test- 
set, and column 'processed' contains the number of new nodes found and explored. Finally, 
column 'optimum' is a feasible point with the best return. The time in seconds of these tasks 
appears in parenthesis under the columns 'basis', 'reduction' and 'nodes', respectively. 

It is worth remarking the case = 0.0040. The first row shows the number of processed 
nodes until an optimal point was reached, with no added cutting hyperplane. The second row 
gives us an example of the effectiveness of adding new cuts. The test-set is computed very 
quickly, although the number of elements is big. However, the number of processed points is 
very small, and hence it is also small the total elapsed time. 



/p'2 
max 


tang. 


basis 


reduction 


Swl 


Sw2 


nodes 


improvement 




0.00118 


3 


32495 
(244 s.) 


(333 s.) 
(176 s.) 



1 


1 



max 
(426 s.) 
88 
(0 s.) 


xq = 35, Xs = 36, Xi4 = 

XiQ = 27, X20 = 8,X28 = 
2^35 = 3, X36 = 3, X43 = 


= 31, 
1 


0.00107 


4 


16930 







1 


max 


xe = 34, = 22,xi4 = 


■2, 






(52 s.) 


(130 s.) 






(393 s.) 


X16 = 29, ^20 = 9,X28 = 
^35 = 3, = 3, X43 = 

reached at 2894 nodes 


= 24, 
1 

in 48 s. 


0.00107 


4 


18637 
(49 s.) 


(114 s.) 





1 


max 
(439 s.) 














1 





9759 


xq = 33, Xs = 29, Xi4 = 










(80 s.) 






(785 s.) 


3^16 = 31,X20 = 9,X28 = 
2^35 = 2, X36 = 3 


= 26, 


0.00105 


4 


14670 







1 


max 


Xq = 34, Xs = 32, Xi4 = 


^2, 






(28 s.) 


(79 s.) 






(627 s.) 


X16 = 29,X20 = 8,X28 = 
2^35 = 3, X36 = 4, X43 = 

reached at 1680 nodes 


= 10, 
2 

in 29 s. 


0.00101 


2 


2613 










2648 


Xq = 34, Xs = 32, xu = 


■2, 






(1 s.) 


(10 s.) 






(853 s.) 


xiQ = 29,X2o = 8,a;28 = 

2^35 = 3,X36 = 4,2:43 = 

optimum 


= 10, 
2 



Table 3: Steps in the computation for Tq = 0.0010 



Table [3] contains all the iterations done in the computation of the optimum for the case 
Tq = 0.0010. The column Swl refers to the variable SwFictBounds, and 5*^2 to SwNumNodes. 
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The first one is true when fictitious bounds are used, and the second one is true when the number 
of processed records is greater than MaxNumNodes = 10000. The column 'improvement' 
contains a new point found with better return than the initial point. Again, the time in seconds 
of each task appears in parenthesis. 

Example 5.2. This example is devoted to show the difference between the rounded continuous 
solution and the integer solution of a portfolio problem. It consists of a mixing of usual stocks 
(Microsoft and General Electric) with the value of a future contract based on oil, as in |Chn09] , 
so the number of values is n = 3. The initial data is given by 

Stock Price (aj) Return (/ij) 

MSFT 35.22 3.64 

GE 36.76 3.64 

Oil 4000 10000 



and the covariance matrix is 

/ 0.003250634 0.000654331 0.022513263 \ 
n = 0.000654331 0.001578359 -0.006610861 . 
\ 0.022513263 -0.006610861 26.35846804 / 

We fix the risk to Tq = 1.52, and compute the optimum for different budgets B. The test-set 
associated with the constraints 

a*£c < B,Re< n^x < [Rc\ , X G (Z+)'' 

has 2663 elements. The basis remains equal for all the considered cases. However, the capacity 
of computation is run out when only one more cut is added. If we take as initial point pe, the 
discrete approximation given by rounding, the tree searching was unable to reach the optimal 
point for MaxNumNodes = 50000. This fact is reported as 'E' in Table |4] 

The function ComputeDiscreteApprox should be changed to get a better discrete approxi- 
mation than the rounded value. If we take an approximation based on the best value for the 
future contract, we get Table HI 

It is easy to see the enormous difference between the return of the discrete approximation 
and the corresponding return of the discrete optimum for each budget. 

6 Conclusions 

We have presented an algorithm to deal with portfolio problems with integer variables and non- 
linear constraints. The presented model was not previously treated as far as we know. The 
method is based on the computation of some test sets using Grobner bases, an algebraic tool. 
These Grobner bases are computed from a linear integer subproblem that contains the original 
linear constraints and some new cuts induced by the non linear constraints. The reversal test- 
set, given by the Grobner basis, allows us to perform a dual search algorithm from the optimal 
solution of the linear subproblem towards the optimal solution of the whole portfolio problem. 
This technique has allowed us to solve problems of size similar to the exposed in |CF07l ILT08j . 
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continuous 



Budget 


optimum 


return 


nodes 


optimum 


Rd 


50000 


(1079.87,0,2.99) 
discrete approx. 


33848.34 










(1192,0,2) 


24338.88 


E 








(219,824,3) 


33796.52 


6066 


(314, 705,3) 


33815.06 


75000 


(1619.80,0,4.49) 
discrete approx. 


50772.52 










(1675,0,4) 


46097.00 


22790 


(1675,0,4) 


46097.00 


100000 


(2159.74,0,5.98) 
discrete approx. 


67696.69 










(2271,0,5) 


58266.44 


E 








(439,1646,6) 


67596.69 


22991 


(687, 1409,6) 


67629.44 



Table 4: Mixed example 
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Appendix 



Procedure NewPolytope(polytope P, matrix Q, point Pe, Tol, tq, MaxNumCuts) 
NumCuts = ; 

Pmax,?^^ = ComputeMaxRisk(P, Q) ; 

/* solve the problem max Q (a;), a; G P. */; 
while r^^ — Tq > Tol and NumCuts < MaxNumCuts do 

S = Pe + A(pmax " Pe), A > ; 

p' = snQ ; 

V = TangentToQuadric((5,p') ; 
DirApprox = Round(f, Prec) ; 

/* round with number of digits = Prec */; 
Coef = TangentloQuadricY {DirApprox, Q) ; 

/* independent term of the tangent hyperplane Q and normal vector equal 
to DirApprox */ ; 
Coef = Ceil(Coe/) ; 

/* the best integer to leave the quadric in a half -space */ ; 

H := DirApprox^ X — Coef ; 

/* new linear cut */ ; 

NumCuts = NumCuts + 1; 
P = Polytope(P,iJ) ; 

/* add a new cut to polytope P */ ; 
Pmax,r-^ = ComputeMaxRisk(P, (5) ; 
end 

return P 
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Algorithm 1: DiscreteOptimum 



Data: budget B, risk rg, matrix Q, vector a, vector fj,, MaxNumCuts, 

MaxNumNodes, Tol 
Result: Optimum, NumNodesProc 

Pc = ComputeContinuousOptimuni(i?, Tq, Q, a, fJ,),gc = A^^Pcj « = 1/2 i 
Pe, Qe = ConiputeDiscreteApprox(pc; B, a, Tq, Q) ; 

SwEOP = false, SwFictBounds = false, Swimprove = false, SwNumNodes = 
false, ListO f Variables = (1 : dim) ; 
while not SwEOP do 

if not SwFictBounds then 

I b, ListO /Variables = CorwputeLo\ierBouiids(ListOf Variables, Pe) ; 
end 

P = Polytope(a*a:; < B, fi^x < gc, n^x > gd, x > b) ; 

NumNodesProc = ; 

while not ( SwEOP or Swimprove or SwNumNodes ) do 
P = NewPolytope(P, Q,Pe, Tol, tq, MaxNumCuts) ; 
G = ConiputeTestSet(P),pini = Reduce(pbounds, C) ; 
SwNumNodes, Swimprove, Optimum = 

TreeSearch(pini, G, Q, a, B, Tq, b, Pe, MaxNumNodes, SwFictBounds) 
if SwFictBounds then 
if not Swimprove then 
I SwEOP = true ; 
else 

I Pe = Optimum, g^ = ^^Optimum, SwFictBounds = false ; 
end 



else 



if not SwNumNodes then 
I SwEOP = true ; 
else 

if not Swimprove then 
I b = b + OL{pe — b), SwFictBounds = true ; 
end 
end 



end 



end 



end 
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